Abstract

Probability-like parameters appearing in some statistical models, and their 
prior distributions, are reinterpreted through the notion of 'circumstance', a 
term which stands for any piece of knowledge that is useful in assigning a prob- 
ability and that satisfies some additional logical properties. The idea, which 
can be traced to Laplace and Jaynes, is that the usual inferential reasonings 
about the probability-like parameters of a statistical model can be conceived as 
reasonings about equivalence classes of 'circumstances' — viz., real or hypo- 
thetical pieces of knowledge, like e.g. physical hypotheses, that are useful in 
assigning a probability and satisfy some additional logical properties — that 
are uniquely indexed by the probability distributions they lead to. 
If you can't join 'em,
join 'em together.



Introduction 

In the present first study we offer an alternative point of view on, or a re-interpretation 
of, probability-like parameters and 'probabilities of probabilities', two objects that 
appear in connexion with statistical models. This also provides a re-interpretation of 
some kinds of inverse methods, for which we develop a simple and general logical 
framework. This point of view, which we think is basically Laplace's but uses an 
idea presented in nuce in some work by Jaynes, is alternative to both that based on
the distinction between 'physical' and 'subjective' probabilities, and that based on
de Finetti's theorem. This point of view and the ensuing inverse-method framework
have applications in physical theories, as will be shown in following studies.

Our study and results are based exclusively on probability theory; we do not use 
entropy notions, for example. For us, 'probability' means simply plausibility, and 
the following conceptual proportion holds:^1

Plausibility calculus : Everyday notion of 'plausibility' = 

Logical calculus : Everyday notion of 'truth'. 

We thus take the licence to adopt the term plausibility henceforth,^2 with no need to
define what it means, any more than is usually done with 'truth'.^3 (Our study can
in any case be easily 'translated' into degrees-of-belief, credence, or similar terms.)

The notation $P(A \mid I)$ will denote the plausibility of the statement $A$ in the
context described by the proposition $I$.^4 We shall also say that $I$ 'leads to' or
'yields' a given plausibility of $A$, but no particular meaning is intended with these
two verbs. Associated plausibility densities will be denoted by $A \mapsto p(A \mid I)$;
the term 'distribution' will sometimes be used for 'density'. Other symbols and
notations are used in accordance with ISO [44] and ANSI/IEEE [45] standards.



In another study [46] we analyse the question of assigning plausibilities to un-
known 'events' (e.g., measurement outcomes) from knowledge of 'similar events'; a



^1 Does Wittgenstein also mean something of the kind when he writes 'Probability theory is only
concerned with the state of expectation in the sense in which logic is with thinking' (§ 237,
p. 231)? See also Johnson [5, pp. 2-3].

^2 Also used by Kordig.

^3 Is truth objective or subjective? Can truth be 'operationally' defined? Can the truth of a
proposition be tested? 'Of course! To test the truth of "This hat is brown" I only need to look
at the hat!' Well, provided e.g. that you are not dreaming or having hallucinations; and how do
you test that? Going backwards, in the end you arrive at some proposition which you simply
assume (unconsciously, by convention, by agreement, by caprice) and cannot 'test'.
'Subjectivity' lurks no less in logic than in plausibility theory, and is no less uninteresting
in plausibility theory than it is in logic.

^4 The context could also be called 'condition' or 'situation'; Johnson calls it 'supposal';
Jeffreys simply 'data' [17]. In the notation above, there is a relation (in the sense in which
there is a relation between 'certain' and 'true') between the expressions '$P(A \mid I) = 1$'
and '$I \models A$' (especially when this is used as in situational logic). The latter could
also be suggestively written '$T(A \mid I) = 1$'. The differences between the formalisms of
logic and plausibility theory lie essentially in the fact that our everyday use of truth can be
effectively (but not exclusively) modelled through a dichotomic set like $\{0, 1\}$ (or
$\{\top, \bot\}$ or $\{\mathrm{T}, \mathrm{F}\}$), whereas our everyday use of plausibility can
be effectively modelled through an ordered continuum like $[0, 1]$ (or $[0, +\infty]$ or
$[-\infty, +\infty]$; see e.g. Tribus [pp. 26-29], Jaynes [20, ch. 2], Cox [12]). This has
important implications, like the fact that a purely syntactic approach to plausibility, in the
guise of the logical calculus, is near to worthless. The parallel between plausibility theory
and truth-functional logic suggests also another point. We do not require of logic, when put to
practical use, that it should also provide us with the initial truth-value assignments (the
'assumptions'). Why should we have an analogous requirement on plausibility theory with regard
to initial plausibility-value assignments instead?
problem which is connected to induction. The key point is the formalisation, within
probability theory, of the notion of 'similar event'. This we do through the framework
and the interpretation presented in the first note. We do not use the idea of
exchangeability (and infinite exchangeability in particular), which is used in Bayesian
theory for the same purpose; but there are known strong connexions and analogies
with its mathematics and some of its results. In fact, we try to persuade the reader
that our approach touches the core idea from which exchangeability also springs.

In a third note [1, 2] it will be shown that the inferential point of view presented
here and in [46] finds applications in physical theories, like classical and quantum
mechanics, an example being state reconstruction [0, 48]. The framework pre- 
sented subsumes and re-interprets known techniques of quantum-state assignment (or 
'retrodiction' or 'reconstruction') and tomography, and offers alternative approaches 
to analogous techniques in classical mechanics. 



1 Statistical models and 'probabilities of probabilities' 

A statistical model is, roughly speaking, a plausibility distribution whose numerical
values depend on parameters (for a critical discussion of more rigorous or useful
definitions see [49, 50]). An example is the ubiquitous normal distribution

$$x \mapsto N(x \mid \mu, \sigma) := \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right],$$

whose parameters are the expectation $\mu$ and the variance $\sigma^2$. Another example, one
in which we shall be especially interested in this paper, is the 'generalised Bernoulli'
model $i \mapsto \mathrm{Br}(i \mid q)$, which gives the plausibility distribution for a set
of $m$ mutually exclusive and exhaustive propositions $\{R_1, \dots, R_m\}$, hereafter called
outcomes,

$$p(R_i \mid q) = \mathrm{Br}(i \mid q) := q_i, \tag{1}$$

depending on a set of parameters $q := (q_1, \dots, q_m)$ which belong to a simplex of
appropriate dimensionality:

$$q \in \Delta := \{(x_i) \mid x_i \ge 0,\ \textstyle\sum_i x_i = 1\}. \tag{2}$$

(This model is apparently called 'discrete model'. Since this name is too anonymous
and the model reduces to the Bernoulli one for $m = 2$, we opted for 'generalised
Bernoulli' instead.)

The parameters of a statistical model are sometimes regarded as 'unknown', and 
a plausibility distribution (more precisely, a density) for them is therefore introduced. 
This distribution, usually called 'prior distribution' or simply 'prior' , is used in cal- 
culations for a variety of purposes; two in particular interest us here and in the fol- 
lowing papers. (1) A parameter-free plausibility distribution for the outcomes can 
be obtained integrating the product of the prior and the statistical model in respect 
of the parameters, i.e., by marginalisation. In the case of the Bernoulli model, e.g., 
introducing a prior $q \mapsto f(q)$ ($f$ being of course a normalised positive generalised
function) one obtains the parameter-free distribution

$$p(R_i) = \int_\Delta \mathrm{Br}(i \mid q)\, f(q)\, \mathrm{d}q = \int_\Delta q_i\, f(q)\, \mathrm{d}q. \tag{3}$$



(2) In so-called 'inverse methods' or 'problems' (cf. Dale [51, §§ 1.2, 1.3]), the
prior is used in the formula of Bayes' theorem to obtain an 'updated' plausibility
distribution for the parameters, conditional on knowledge of some outcome. The
resulting distribution is called 'posterior (distribution)'. In the case of the Bernoulli
model, e.g., from the prior $q \mapsto f(q)$ and knowledge of the outcome $R_i$ one obtains
the posterior

$$q \mapsto f(q \mid R_i) = \frac{p(R_i \mid q)\, f(q)}{\int_\Delta p(R_i \mid q')\, f(q')\, \mathrm{d}q'}. \tag{4}$$
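As a minimal sketch of eqs. (3) and (4), assume a prior concentrated on finitely many points of the simplex, so that the integrals reduce to sums. The function names `marginal` and `posterior` and the three-outcome numbers are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

def marginal(prior):
    """Eq. (3) for a prior supported on finitely many points q of the simplex.

    `prior` maps each parameter tuple q = (q_1, ..., q_m) to its weight f(q).
    Returns the tuple (p(R_1), ..., p(R_m)).
    """
    m = len(next(iter(prior)))
    return tuple(sum(q[i] * w for q, w in prior.items()) for i in range(m))

def posterior(prior, i):
    """Eq. (4): updated weights after observing outcome R_i (0-based index i)."""
    norm = sum(q[i] * w for q, w in prior.items())
    return {q: q[i] * w / norm for q, w in prior.items()}

# Hypothetical example with m = 3 outcomes; the numbers are illustrative only.
q_first = (Fraction(1, 2), Fraction(1, 4), Fraction(1, 4))
q_second = (Fraction(1, 5), Fraction(2, 5), Fraction(2, 5))
prior = {q_first: Fraction(1, 3), q_second: Fraction(2, 3)}

p = marginal(prior)          # parameter-free distribution for the outcomes
post = posterior(prior, 0)   # weights of the two parameter values given R_1
print(p, post)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that normalisation can be checked without rounding error.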



Such practices are at least as old as Bayes [52, 53]. Related and unrelated
historical information can be found e.g. in Dale's book [51] and some nice essays
by Hacking [55, 56, 57]; see also Jaynes' discussion [20, ch. 18]. Old, though
apparently not as old as Bayes, is also the question: how to interpret statistical-model
parameters like $q$ and their prior distributions? The problem is that the parameters
$(q_i)$ look like plausibilities, since their values are identical to the plausibilities of the
outcomes $\{R_i\}$ as eq. (1) shows, and that the prior $f$ looks therefore like a 'plausibility
(distribution) of a plausibility', a redundant notion. This question, combined with
the related issues on the interpretation of 'probability', has led to many philosophical 
debates; see e.g., amongst the vast literature on this, [58, 59]. The importance of 
the interpretative question is not merely philosophical, however. Different interpre- 
tations can lead to different conceptual and mathematical approaches — and thus to 
different solutions — in the investigation of concrete problems. This is particularly 
true for elaborate statistical models, like those connected to physical theories. 

Two main interpretations appear to be in vogue. Many statisticians, logicians,
and physicists, on the one hand, speak about 'subjective' and 'physical' probabilities
(or 'propensities' [60]). For them the notion of a 'probability of a probability' poses
no problems, since it means something like 'the subjective probability of a propensity'.
The very idea of 'estimating a probability' implies such kinds of interpretation;
cf. e.g. Good [61], especially the title and chapter 2, Jamison [62], or Tintner [63].

For pious Bayesian or 'de Finettian' devotees, on the other hand, who conceive
probability as 'degree of belief', the notion of a 'degree of belief in a degree
of belief' is redundant or even meaningless. The Bayesians are notoriously rescued
from philosophical headaches by de Finetti's celebrated theorem and other similar
ones, by which parameters like $q$ and functions like $q \mapsto f(q)$ are introduced as
mere mathematical devices (not plausibilities or degrees of belief!) that need not be
directly interpreted. See Bernardo and Smith (ch. 4) for a neat presentation of this
point of view. Interpretative issues like the Bayesians' are also shared by those who
think in terms of 'logical probabilities' or, like us, simply in terms of 'plausibilities'.
Here we present, discuss, and formalise still another interpretation (let us call
it the 'circumstance interpretation' for definiteness' sake, for reasons that will be
apparent in the next section) in which functions like $q \mapsto f(q)$ do represent
plausibility distributions, i.e., they are not mere mathematical devices, but the notion of
'plausibility of a plausibility' is nevertheless completely avoided. This interpretation
combines two ideas by which Jaynes tried to make sense of 'plausibility-like' parameters:
one is very briefly formulated in one of his works (p. 11), and can possibly be read also in
de Finetti (§ 20); the other, the idea of an '$A_p$ distribution', appears in the various
versions of his book on probability theory [18, lect. 18][19, lect. 5][20, ch. 18].
A similar interpretation is also proposed and discussed by Mosleh and Bier [59].
Caves also discusses, and criticises, a similar idea [78]. It really seems to us,
however, that this interpretation is basically what Laplace had in mind [53], if we
read his 'causes' more generally as 'circumstances'.

Instead of trying to summarise this interpretation in abstract general terms that 
would very likely only appear obscure at this point, we prefer to invite the reader to 
proceed to the simple and concrete example of the next section, just a coin toss away. 
The example will allow us to introduce the basic idea, along with some terminology. 
Then another, more elaborate example (§ 3) follows, to further expand the main idea.
This is then abstracted and generalised (§ 4). Some important remarks are scattered
throughout this note.



2 Interpreting plausibility-like parameters as 'indexed 
circumstances': introductory example 

2.1 Context and circumstances 

A coin has been tossed, the outcome unknown to us. We want to assign plausibilities
to the outcomes 'head', $R_{\mathrm h}$, and 'tail', $R_{\mathrm t}$. The old recipe says to
compute "le rapport du nombre des cas favorables à celui de tous les cas possibles" (the ratio
of the number of favourable cases to that of all possible cases). This is seldom
of much help: Which are the cases? At which depth should the situation be analysed?
And what if these cases are not equally plausible?

But why not analyse the situation in terms of some set of 'cases' anyway? Some 
set, not the set. And their plausibilities can be assigned by some other means. We do 
not want the ultimate analysis, just an analysis. 

In our case, suppose that the knowledge of the situation, which constitutes the
context $I_{\mathrm{co}}$, says that either Cecily or Gwendolen or Jack or Algernon tossed the
coin. Let us call these the four possible circumstances of the coin toss and denote
them by $C_{\mathrm C}$, $C_{\mathrm G}$, $C_{\mathrm J}$, $C_{\mathrm A}$. The context could thus
be analysed as the conjunction
$I_{\mathrm{co}} = I'_{\mathrm{co}} \land (C_{\mathrm C} \lor C_{\mathrm G} \lor C_{\mathrm J} \lor C_{\mathrm A})$,
for some 'sub-context' $I'_{\mathrm{co}}$.
Each circumstance says also something more about the respective person, which
helps us in assigning the conditional plausibilities:^5

^5 Cf. Laplace's Problème II.
$C_{\mathrm C}$: Cecily is a magician and skilled coin-tosser who always likes to produce the
outcome 'head'. If we knew that she had tossed the coin, we would assign the
distribution of plausibility

$$\bigl(P(R_{\mathrm h} \mid C_{\mathrm C} \land I_{\mathrm{co}}),\ P(R_{\mathrm t} \mid C_{\mathrm C} \land I_{\mathrm{co}})\bigr) = (1, 0) \tag{5}$$

for the outcomes.

$C_{\mathrm G}$: Gwendolen, on the other hand, has no such particular skills, so if it were she
who had tossed the coin we would assign the plausibility distribution

$$\bigl(P(R_i \mid C_{\mathrm G} \land I_{\mathrm{co}})\bigr) = (\tfrac12, \tfrac12), \tag{6}$$

with $i = \mathrm{h}, \mathrm{t}$ here and in the following.

$C_{\mathrm J}$: About Jack we know nothing whatsoever. He could be skilled or unskilled in
coin-tossing, a trickster or an absolutely earnest person. If we knew he had
tossed the coin we could but assign the distribution

$$\bigl(P(R_i \mid C_{\mathrm J} \land I_{\mathrm{co}})\bigr) = (\tfrac12, \tfrac12). \tag{7}$$

$C_{\mathrm A}$: Finally, we know that Algernon had been carrying a double-headed coin, which
he would exchange with the original one if asked to toss it. So we assign the
plausibilities

$$\bigl(P(R_i \mid C_{\mathrm A} \land I_{\mathrm{co}})\bigr) = (1, 0) \tag{8}$$

in case he had made the coin toss.



Remark 1. It is clear that not all the circumstances above express 'causes' [53] or
'mechanisms' [18, lects. 16, 17][19, lect. 5][20, ch. 18][78] which 'determine' the
respective plausibility distributions. It could be appropriate to say this of the circum- 
stance concerning Algernon; but the circumstance concerning Jack, e.g., can hardly 
be called a 'cause' or 'mechanism': it is only out of sheer ignorance that we assign, 
conditionally upon it, the distribution (1/2, 1/2). Here and in the following, 'circum- 
stance' will generally mean simply what its name denotes: 'a possibly unessential or 
secondary condition, detail, part, state of affair, factor, accompaniment, or attribute, 
in respect of time, place, manner, agent, etc., that accompanies, surrounds, or possibly
determines, modifies, or influences a fact or event'. □



2.2 Grouping the circumstances in a special way 

The crucial step now is the following. Suppose that these four circumstances interest
us not for their intrinsic details, but only in connexion with the plausibility
distributions they lead to for the coin toss in the context $I_{\mathrm{co}}$. In this regard,
the circumstance 'Cecily tossed the coin'^6 and the circumstance 'Algernon tossed the coin'
are for us

^6 Our knowledge about Cecily must also be understood as implicit in this sentence; otherwise we
should write 'Cecily, who is a magician etc., tossed the coin'. This also holds for the sentences that 
follow. 
equivalent, since both lead to the plausibility distribution $(1, 0)$, as shown by eqs. (5)
and (8). Similarly, 'Gwendolen tossed the coin' and 'Jack tossed the coin' are also
equivalent, both leading to $(1/2, 1/2)$; cf. eqs. (6) and (7). We should like to have a
set of circumstances such that different circumstances lead to different distributions.
The first thing that comes to mind is to take the set
$\{C_{\mathrm C} \lor C_{\mathrm A},\ C_{\mathrm G} \lor C_{\mathrm J}\}$ of the disjunctions of
equivalent circumstances, i.e. $C_{\mathrm C} \lor C_{\mathrm A}$ = 'Cecily or Algernon tossed the
coin' and $C_{\mathrm G} \lor C_{\mathrm J}$ = 'Gwendolen or Jack tossed the coin'. We must see,
however, whether this 'coarse-grained' set really fulfils our wishes.

A simple theorem of plausibility theory comes to help. It says that, in a given
context, the plausibility of a statement $A$ conditional on a disjunction of mutually
exclusive propositions $\{B_j\}$ is given by a convex sum of the plausibilities conditional
on the single propositions, as follows [20, ch. 2]:

$$P\bigl[A \mid (\textstyle\bigvee_j B_j) \land I\bigr] = \sum_j P(A \mid B_j \land I)\, \frac{P(B_j \mid I)}{\sum_k P(B_k \mid I)} \qquad (\{B_j\}\ \text{mutually exclusive}), \tag{9}$$

the weights being proportional to the plausibilities of the $\{B_j\}$. Note that the value of
the plausibility conditional on the disjunction, $P[A \mid (\bigvee_j B_j) \land I]$, generally
depends on the values of the plausibilities of the $\{B_j\}$, $\{P(B_j \mid I)\}$. Thus, the
latter plausibilities must in general be specified if we want to find the former, which varies
as these vary. However, we see that this dependence disappears when the plausibilities
conditional on each single $B_j$, $\{P(A \mid B_j \land I)\}$, have all the same value (the
right-hand side becomes a convex sum of identical points). In this case also the plausibility
conditional on the disjunction, $P[A \mid (\bigvee_j B_j) \land I]$, will have that same value,
irrespective of the plausibilities of the $\{B_j\}$:^7

$$\text{if } P(A \mid B_j \land I) = q \text{ for all } j, \text{ then } P\bigl[A \mid (\textstyle\bigvee_j B_j) \land I\bigr] = q, \text{ regardless of the values of the } \{P(B_j \mid I)\}. \tag{10}$$
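The following sketch evaluates the right-hand side of eq. (9) numerically and checks the special case (10). The function name `conditional_on_disjunction` and the sample weights are assumptions made for illustration.

```python
from fractions import Fraction

def conditional_on_disjunction(cond, weights):
    """Right-hand side of eq. (9) for mutually exclusive B_j.

    cond[j]    = P(A | B_j and I)
    weights[j] = P(B_j | I); they need not sum to one.
    """
    total = sum(weights)
    return sum(c * w for c, w in zip(cond, weights)) / total

half = Fraction(1, 2)

# Equal conditionals: the result does not depend on the weights, as in eq. (10).
same_1 = conditional_on_disjunction([half, half], [Fraction(1, 4), Fraction(1, 4)])
same_2 = conditional_on_disjunction([half, half], [Fraction(1, 10), Fraction(3, 5)])

# Unequal conditionals: the result changes when the weights change.
diff_1 = conditional_on_disjunction([Fraction(1), half], [Fraction(1, 4), Fraction(1, 4)])
diff_2 = conditional_on_disjunction([Fraction(1), half], [Fraction(1, 10), Fraction(3, 5)])
print(same_1, same_2, diff_1, diff_2)
```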

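Clearly this is just the case when $A$ is either of our outcomes $\{R_i\}$ and the $\{B_j\}$
are either pair of equivalent circumstances. In fact, the protasis of the last formula
is just our previous definition of equivalence amongst circumstances. Hence, the
plausibility distribution for the results conditional on the disjunction
$C_{\mathrm C} \lor C_{\mathrm A}$ is the same as those conditional on the two disjuncts
separately,

$$\bigl(P[R_i \mid (C_{\mathrm C} \lor C_{\mathrm A}) \land I_{\mathrm{co}}]\bigr) = \bigl(P(R_i \mid C_{\mathrm C} \land I_{\mathrm{co}})\bigr) = \bigl(P(R_i \mid C_{\mathrm A} \land I_{\mathrm{co}})\bigr) = (1, 0), \tag{11}$$

and analogously for $C_{\mathrm G} \lor C_{\mathrm J}$:

$$\bigl(P[R_i \mid (C_{\mathrm G} \lor C_{\mathrm J}) \land I_{\mathrm{co}}]\bigr) = \bigl(P(R_i \mid C_{\mathrm G} \land I_{\mathrm{co}})\bigr) = \bigl(P(R_i \mid C_{\mathrm J} \land I_{\mathrm{co}})\bigr) = (\tfrac12, \tfrac12). \tag{12}$$

This is true whatever the plausibilities of our four initial circumstances might be (in
fact, we have not yet specified them!).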

^7 Cases of vanishing plausibilities can be treated as appropriate limits. One can adopt the consistent
convention that the product of an undefined plausibility (such as those with a contradictory context) 
times a defined and vanishing one also vanishes. 
The coarse-grained set $\{C_{\mathrm C} \lor C_{\mathrm A},\ C_{\mathrm G} \lor C_{\mathrm J}\}$
has thus, by construction, the special feature we looked for: different circumstances lead to
different plausibility distributions for the outcomes. The circumstances can therefore be
uniquely indexed by the respectively assigned plausibility distributions, and we denote them
accordingly:

$$S_{(1,0)} := C_{\mathrm C} \lor C_{\mathrm A}, \qquad S_{(1/2,1/2)} := C_{\mathrm G} \lor C_{\mathrm J}, \tag{13}$$

and call them plausibility-indexed circumstances. With this indexing system, and
denoting $q := (q_{\mathrm h}, q_{\mathrm t})$, the conditional plausibilities of the outcomes can
be written

$$P(R_i \mid S_q \land I_{\mathrm{co}}) = q_i. \tag{14}$$

The last expression is in many ways similar to that defining the generalised
Bernoulli model (1). Indeed, one of our main points is the following: plausibility-like
parameters used as arguments of plausibilities can always be interpreted to stand for
appropriate plausibility-indexed circumstances.
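The grouping step can be sketched as a dictionary keyed by the conditional distributions of eqs. (5)-(8). The function name `index_by_plausibility` and the use of the persons' names as labels for the circumstances are illustrative assumptions.

```python
from collections import defaultdict
from fractions import Fraction

half = Fraction(1, 2)

# (P(R_h | C and I_co), P(R_t | C and I_co)) for each circumstance, eqs. (5)-(8).
conditional = {
    "Cecily": (Fraction(1), Fraction(0)),
    "Gwendolen": (half, half),
    "Jack": (half, half),
    "Algernon": (Fraction(1), Fraction(0)),
}

def index_by_plausibility(conditional):
    """Group circumstances leading to the same distribution, as in eq. (13).

    Each key q plays the role of the index of S_q; each value is the set of
    disjuncts making up that plausibility-indexed circumstance.
    """
    classes = defaultdict(set)
    for circumstance, q in conditional.items():
        classes[q].add(circumstance)
    return dict(classes)

S = index_by_plausibility(conditional)
for q, members in S.items():
    print(q, sorted(members))
```

Note that the keys are only labels: each value is still a proposition about who tossed the coin, not a statement about plausibilities.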

In view of eq. (14), someone could interpret the symbol '$S_q$' as 'The plausibility
distribution for the $\{R_i\}$ is $q$' (similarly to the symbol '$A_p$' introduced by Jaynes
[18, lect. 18][19, lect. 5][20, ch. 18]). But such an interpretation is obviously wrong. Let
us make this point clear. The symbol '$S_{(1,0)}$', e.g., stands for 'Cecily or Algernon
tossed the coin', as eq. (13) shows; and this proposition does not concern plausibilities
at all. It is true that this proposition is the only one leading us to assign
the distribution $(1, 0)$; but it is so just because of a trick, viz. the fact that we have
grouped and indexed the initial circumstances in a particular way. Borrowing some
terminology from logic, we can say that the correspondence between the proposition
'Cecily or Algernon tossed the coin' and the distribution $(1, 0)$ is only a trick within
the metalanguage of our theory.

Remark 2. The use of statements like 'The plausibility of $A$ is $p$' or 'Data are drawn
from a distribution $f$' is universal. Of course, they can be simply interpreted as
'Look, the context and the circumstance are such that the plausibility of $A$ (the data)
is $p$ ($f$)', and this can be enough for our purposes: we may not need to know all the
details of the context and the circumstance. But note that those statements are more
precisely metastatements, statements about plausibility assignments. As in logic, the
use of such kind of statements as arguments of plausibility formulae is preferably
avoided. First, because such statements usually make poor contexts. Compare the
statements 'Either Cecily, who is a skilled coin tosser with a predilection for 'head', or
Algernon, who has a two-headed coin, tossed the coin' with 'The plausibility distribution
for 'head' and 'tail' is $(1, 0)$': the former gives some clues as to the grounds
on which the distribution $(1, 0)$ is assigned, whereas the latter says only that that
distribution is assigned.^8 Second, because such statements used inside plausibility
formulae may give rise to self-references, circularity, and thus known paradoxes ('This
proposition is false') and other inconsistencies [90].^9 □

^8 It reminds of Bachelierus' oft quoted answer: "Mihi a docto Doctore/ Domandatur causam et
rationem, quare/ Opium facit dormire?/ A quoi respondeo,/ Quia est in eo/ Virtus dormitiva./ Cujus est
natura/ Sensus assoupire" (troisième intermède).



2.3 Analysis by marginalisation 

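Let us now introduce the plausibilities of the original circumstances in the context
$I_{\mathrm{co}}$. For concreteness we can assume them to be equally plausible:

$$P(C_{\mathrm C} \mid I_{\mathrm{co}}) = P(C_{\mathrm G} \mid I_{\mathrm{co}}) = P(C_{\mathrm J} \mid I_{\mathrm{co}}) = P(C_{\mathrm A} \mid I_{\mathrm{co}}) = \tfrac14. \tag{15}$$

From these values and the definitions (13) we have by the sum rule the plausibilities
of the plausibility-indexed circumstances:

$$P[S_{(1,0)} \mid I_{\mathrm{co}}] = \tfrac12, \qquad P[S_{(1/2,1/2)} \mid I_{\mathrm{co}}] = \tfrac12. \tag{16}$$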

These plausibilities can be used to write the distribution for the outcomes in the
context $I_{\mathrm{co}}$ by marginalisation over the circumstances. We can do this both with the
initial set $\{C_j\}$ and with the set of plausibility-indexed circumstances $\{S_q\}$. With
the first we obtain

$$\bigl(P(R_i \mid I_{\mathrm{co}})\bigr) = \sum_j \bigl(P(R_i \mid C_j \land I_{\mathrm{co}})\bigr)\, P(C_j \mid I_{\mathrm{co}}) = (\tfrac34, \tfrac14). \tag{17}$$

With the second set we must of course obtain, consistently, the same result; but the
decomposition has a more suggestive (and possibly misleading!) form:

$$\bigl(P(R_i \mid I_{\mathrm{co}})\bigr) = \sum_q \bigl(P(R_i \mid S_q \land I_{\mathrm{co}})\bigr)\, P(S_q \mid I_{\mathrm{co}}). \tag{18}$$

The index $q$ assumes the two values $\{(1, 0), (1/2, 1/2)\}$, but we can let it range over
the whole simplex $\Delta$ defined in (2), introducing a density function
$q \mapsto p_S(q \mid I_{\mathrm{co}})$ in the usual way (explained later in § 4). In this case
it is given by

$$p_S(q \mid I_{\mathrm{co}}) := \tfrac12 \bigl[\delta(q_{\mathrm h} - 1) + \delta(q_{\mathrm h} - \tfrac12)\bigr]\, \delta(q_{\mathrm h} + q_{\mathrm t} - 1), \tag{19}$$



^9 We find an example in an article by Friedman and Shimony. They introduce a proposition
which says that the expectation of a certain quantity has a given value (their eq. (4)). But such a
proposition is a metastatement, because expectation is defined in terms of plausibility assignments (in
contrast to average, which is defined in terms of measured frequencies). The authors, however, do
not notice this and proceed to use that proposition inside plausibilities, obtaining peculiar conclusions.
Gage and Hestenes [93] apparently show that these conclusions are not inconsistent, although they do
not notice the mix-up of language and metalanguage either. Cyranski [94] has a partially clearer view
of the matter. Cf. remark ^. A metastatement inside a plausibility is used, although tentatively, also 
by Jaynes (his '$A_p$') [18, lect. 18][19, lect. 5][20, ch. 18]; but our analysis shows that his ideas can be
realised without this artifice.
a weighted sum of Dirac deltas^10 with support on $q = (1, 0)$ and $q = (1/2, 1/2)$. The
marginalisation (18) thus takes the form

$$\bigl(P(R_i \mid I_{\mathrm{co}})\bigr) = \int_\Delta q\, p_S(q \mid I_{\mathrm{co}})\, \mathrm{d}q, \tag{20}$$

which is similar to the formula (3) for the generalised Bernoulli model.
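The agreement between the two decompositions, eqs. (17) and (18), can be checked with a few lines of exact arithmetic. The sketch assumes the equal plausibilities of eq. (15); the variable names and the one-letter labels for the circumstances are illustrative.

```python
from fractions import Fraction

half, quarter = Fraction(1, 2), Fraction(1, 4)

# Conditional distributions of eqs. (5)-(8), keyed by the initials of the four persons.
cond = {
    "C": (Fraction(1), Fraction(0)),
    "G": (half, half),
    "J": (half, half),
    "A": (Fraction(1), Fraction(0)),
}
prior_C = {name: quarter for name in cond}  # eq. (15)

# Eq. (17): marginalise over the original circumstances.
fine = tuple(sum(cond[c][i] * prior_C[c] for c in cond) for i in range(2))

# Eq. (16): by the sum rule, each indexed circumstance S_q collects its disjuncts.
prior_S = {}
for c, q in cond.items():
    prior_S[q] = prior_S.get(q, 0) + prior_C[c]

# Eq. (18), equivalently eq. (20) with the delta density (19):
# the conditional distribution given S_q is q itself.
coarse = tuple(sum(q[i] * w for q, w in prior_S.items()) for i in range(2))
print(fine, coarse)
```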

2.4 Updating the plausibility of the circumstances 

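If the outcome of the toss is, say, 'head', what do the plausibilities of the
circumstances become? In other words, what are the circumstances' plausibilities in the
context $R_{\mathrm h} \land I_{\mathrm{co}}$? The answer is obviously given by Bayes' theorem:

$$P(S_q \mid R_{\mathrm h} \land I_{\mathrm{co}}) = \frac{P(R_{\mathrm h} \mid S_q \land I_{\mathrm{co}})\, P(S_q \mid I_{\mathrm{co}})}{\sum_{q'} P(R_{\mathrm h} \mid S_{q'} \land I_{\mathrm{co}})\, P(S_{q'} \mid I_{\mathrm{co}})}, \tag{21}$$

or, in terms of the density $p_S$,

$$p_S(q \mid R_{\mathrm h} \land I_{\mathrm{co}}) = \frac{q_{\mathrm h}\, p_S(q \mid I_{\mathrm{co}})}{\int_\Delta q'_{\mathrm h}\, p_S(q' \mid I_{\mathrm{co}})\, \mathrm{d}q'} = \bigl[\tfrac23\, \delta(q_{\mathrm h} - 1) + \tfrac13\, \delta(q_{\mathrm h} - \tfrac12)\bigr]\, \delta(q_{\mathrm h} + q_{\mathrm t} - 1). \tag{22}$$

The plausibility of $S_{(1,0)}$, i.e., that Cecily or Algernon tossed the coin, has thus
increased a little (from 1/2 to 2/3).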



Remark 3. Note that knowledge of the outcome can help to increase the plausibility
of one of the plausibility-indexed circumstances $\{S_q\}$ at the expense of the others,
but can never do so within a set of equivalent circumstances like
$\{C_{\mathrm C}, C_{\mathrm A}\}$ or $\{C_{\mathrm G}, C_{\mathrm J}\}$.

□
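The update of eqs. (21)-(22) and the observation of Remark 3 can be reproduced with a small sketch. The helper name `bayes_update` is an assumption; the priors are those of eqs. (15) and (16).

```python
from fractions import Fraction

def bayes_update(prior, likelihood):
    """Bayes' theorem over a finite set of mutually exclusive hypotheses."""
    norm = sum(likelihood[h] * prior[h] for h in prior)
    return {h: likelihood[h] * prior[h] / norm for h in prior}

half, quarter = Fraction(1, 2), Fraction(1, 4)

# Plausibility-indexed circumstances: the index q = (q_h, q_t) supplies the likelihood.
s_10 = (Fraction(1), Fraction(0))
s_half = (half, half)
prior_S = {s_10: half, s_half: half}                        # eq. (16)
post_S = bayes_update(prior_S, {q: q[0] for q in prior_S})  # outcome 'head', eq. (21)

# Original circumstances, equally plausible as in eq. (15).
p_head = {"Cecily": Fraction(1), "Gwendolen": half, "Jack": half, "Algernon": Fraction(1)}
prior_C = {name: quarter for name in p_head}
post_C = bayes_update(prior_C, p_head)

print(post_S)
print(post_C)
```

Within each equivalence class the ratio of plausibilities is unchanged by the update (Cecily and Algernon remain equally plausible, and so do Gwendolen and Jack), which is the content of Remark 3.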

The last formula is a very simple instance of the answer to an inverse problem. 
Our point is, again, that the marginalisation over a plausibility-like parameter and 
the updating of its distribution can be interpreted as the same operations for a set 
of plausibility-indexed circumstances. From this standpoint, and as should be clear 
from a previous discussion and remarks, the plausibility $P(S_q \mid I_{\mathrm{co}})$ (and its density
$p_S(q \mid I_{\mathrm{co}})$) is not the plausibility of a plausibility, but simply the plausibility of a
circumstance, the latter being indexed in a particular way. 

3 Second Example: multiple measurements, particular 
convex structures of circumstances, updating 

3.1 Context 

The following example differs from the first in the number of measurements and
circumstances considered. Consequences: the space of parameters has particular
convex structures, and the plausibilities of the outcomes of one measurement can be
'updated' upon knowledge of the outcome of the other.

^10 In Egorov's sense.

We have a box with two buttons, marked 'Letter' and 'Number', and a display.
Push the 'Letter' button, and either 'a' or 'b' appears on the display; push 'Number',
and '1' or '2' appears. We can push each button only once, and only one at a time.
Call, improperly, measurement the act of pushing a button and reading the display;
call 'outcome' what is then read on the display. Denote the 'Letter' measurement
by $M^{\mathrm L}$ and its outcomes by $\{R^{\mathrm L}_{\mathrm a}, R^{\mathrm L}_{\mathrm b}\}$; the
'Number' measurement by $M^{\mathrm N}$ and its outcomes by
$\{R^{\mathrm N}_1, R^{\mathrm N}_2\}$.

Given only the above knowledge, we should assign a plausibility distribution
$(1/2, 1/2)$ to the outcomes of each measurement. But we know in fact something
more about the construction of the box: inside, besides some sort of machinery, there
is a chest containing an even number, $2N$, of balls. Each ball is marked either 'a1',
'a2', 'b1', or 'b2'. When a button is pushed, the machinery draws one of the balls
from the chest and sends, depending on the button, either the letter or the number
printed on the drawn ball to the display; and then puts the ball back into the chest.^11
We have also a very important piece of knowledge as to how the $2N$ balls were
originally chosen and put into the chest: this initially contained $4N$ balls, marked
'a1', 'a2', 'b1', and 'b2' in equal proportions (i.e., $N$ balls marked 'a1', $N$ marked 'a2', etc.).
From these, $2N$ balls were taken away, so only $2N$ remained in the chest. This is
all we know; denote it (together with everyday knowledge concerning balls, buttons,
boxes, etc.) by $I_N$.

From $I_N$, some points are immediately clear. First, not all the $2N$ balls in the
chest can be marked 'a1', nor all 'a2', etc., since the chest initially contained only
$N$ of each type. Second, if all the $2N$ balls have the 'a' mark, then $N$ of them must
necessarily be of the 'a1' kind and the other $N$ must be of the 'a2' kind. Similarly for
the marks 'b' and, exchanging the role of letters and numbers, '1' and '2'.

3.2 Introducing a set of circumstances 

Let us analyse the context $I_N$ into a set of mutually exclusive and exhaustive possible
circumstances. Different choices are possible. One is to consider the possible sets
of balls left in (or equivalently, taken away from) the chest. The number of
circumstances thus defined is given by the number of ways of choosing $2N$ objects from
a collection of $4N$ distinct ones without regard to order: the binomial coefficient
$\binom{4N}{2N}$. Note that it matters which of the 'a1'-marked balls are chosen, and likewise
for the others. Our knowledge is symmetric in respect of these circumstances, hence
they are assigned equal plausibilities.^12

^11 This renders the temporal order of the measurements (if both are performed) irrelevant. That is
why we are not making temporal considerations. (Note also that we do not need to suppose that the urn
is shaken after the replacement of the ball: this would add nothing to our state of knowledge, since we
do not know how the machine makes the replacement anyhow.)

^12 That is, they are assigned equal plausibilities not because 'the balls are initially chosen at random'
or something of the kind, but because we just do not know how they have been chosen. In fact, they 
Another choice is to consider as a circumstance the numbers of balls marked 'a1',
'a2', etc. left in the chest instead. Note the difference with the previous choice: this
is a sort of 'coarse graining' thereof. For this reason the newly defined circumstances
are not equally plausible. We settle for this second choice and denote a generic
circumstance by $C_{(\alpha \mathrm{a1}, \beta \mathrm{a2}, \gamma \mathrm{b1}, \delta \mathrm{b2})}$,
meaning '$\alpha$ 'a1'-marked balls, $\beta$ 'a2'-marked balls, $\gamma$ 'b1'-marked balls, and
$\delta$ 'b2'-marked balls are left in the chest'. The coefficients $\alpha$, $\beta$, etc. must
obviously sum up to $2N$ and each can range from $0$ to $N$.
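A short sketch can enumerate these count-type circumstances and show why they are not equally plausible. It assumes, as stated above, that the $\binom{4N}{2N}$ possible sets of remaining balls are equally plausible, so that the plausibility of a count circumstance is the fraction of such sets compatible with it. The function names `circumstances` and `plausibility` are illustrative.

```python
from fractions import Fraction
from math import comb

def circumstances(N):
    """All (alpha, beta, gamma, delta) with entries in 0..N summing to 2N."""
    return [
        (a, b, c, 2 * N - a - b - c)
        for a in range(N + 1)
        for b in range(N + 1)
        for c in range(N + 1)
        if 0 <= 2 * N - a - b - c <= N
    ]

def plausibility(circ, N):
    """Fraction of the equally plausible ball sets compatible with the counts."""
    ways = 1
    for k in circ:
        ways *= comb(N, k)  # which k of the N balls of this type are left
    return Fraction(ways, comb(4 * N, 2 * N))

N = 2
circs = circumstances(N)
weights = {c: plausibility(c, N) for c in circs}
for c in circs:
    print(c, weights[c])
```

For `N = 2` the circumstance with one ball of each type is considerably more plausible than, for example, the one with two 'a1' and two 'a2' balls.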

3.3 Plausibility-indexing the circumstances; their particular set 

As in the previous example, suppose that we are not interested in the details of the
circumstances above, but only in the plausibilities they lead us to assign to the
outcomes of the two measurements $M^{\mathrm L}$ and $M^{\mathrm N}$. We can group the
circumstances into plausibility-indexed equivalence classes, as before. In the present case the
equivalence must take into account two plausibility distributions, one for each
measurement.

Here is an example for $N = 2$. The two different circumstances $C_{(2a1,0a2,1b1,1b2)}$
and $C_{(1a1,1a2,2b1,0b2)}$ both lead to the same plausibility distribution (1/2, 1/2) for the
'Letter' measurement, and to the same distribution (3/4, 1/4) for the 'Number' measurement
(as is clear by simply counting their 'a's, 'b's, '1's, and '2's). Moreover,
only these two circumstances lead to the plausibility distributions above, as the reader
can prove. By theorem (|l^), also their disjunction $C_{(2a1,0a2,1b1,1b2)} \lor C_{(1a1,1a2,2b1,0b2)}$
leads to the same distributions and can thus be denoted by

$$S_{((1/2,1/2),(3/4,1/4))} := C_{(2a1,0a2,1b1,1b2)} \lor C_{(1a1,1a2,2b1,0b2)}. \tag{23}$$

This is one of the plausibility-indexed circumstances. Its plausibility is the sum
of its disjuncts' plausibilities,
$P[S_{((1/2,1/2),(3/4,1/4))} | I_N] = P(C_{(2a1,0a2,1b1,1b2)} | I_N) + P(C_{(1a1,1a2,2b1,0b2)} | I_N)$.
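
The grouping just described can be sketched in a few lines of Python. The sketch assumes that the plausibility of an outcome, given the contents of the chest, is the fraction of balls carrying the corresponding letter or number (this is the 'simple counting' used above), and that the prior plausibilities come from equally plausible sets of left balls; `index_of`, `prior` and `plausibility_indexed` are illustrative names of ours.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product
from math import comb, prod

def index_of(counts):
    """Return (q^L, q^N) for counts = (#a1, #a2, #b1, #b2) left in the chest."""
    a1, a2, b1, b2 = counts
    n = sum(counts)
    q_letter = (Fraction(a1 + a2, n), Fraction(b1 + b2, n))  # outcomes 'a', 'b'
    q_number = (Fraction(a1 + b1, n), Fraction(a2 + b2, n))  # outcomes '1', '2'
    return q_letter, q_number

def prior(counts, N):
    """P(C | I_N), assuming all sets of 2N balls out of 4N are equally plausible."""
    return Fraction(prod(comb(N, c) for c in counts), comb(4 * N, 2 * N))

def plausibility_indexed(N):
    """Group the count tuples by the pair of distributions they lead to."""
    classes = defaultdict(list)
    for counts in product(range(N + 1), repeat=4):
        if sum(counts) == 2 * N:
            classes[index_of(counts)].append(counts)
    return dict(classes)

N = 2
classes = plausibility_indexed(N)
half = Fraction(1, 2)
q = ((half, half), (Fraction(3, 4), Fraction(1, 4)))
members = sorted(classes[q])                # the disjuncts of the circumstance (23)
p_S = sum(prior(c, N) for c in members)     # sum of the disjuncts' plausibilities
```

Here `members` contains exactly the two count tuples (1, 1, 2, 0) and (2, 0, 1, 1), and under the stated assumption `p_S` equals 4/35.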

In general, for any $N$, we have plausibility-indexed circumstances denoted by
$S_{(q^L, q^N)}$, the parameters $q^L$ and $q^N$ corresponding to the plausibility distributions for
the 'Letter' and the 'Number' measurements. The indexing is such that

$$P(R^k_i | M^k \land S_{(q^L,q^N)} \land I_N) = q^k_i \quad \text{for } k = L, N \text{ and all appropriate } i. \tag{24}$$

We leave to the reader the pleasure of proving that there is a total of $N^2 + (N + 1)^2$
plausibility-indexed circumstances, i.e., of distinct values for the parameters $(q^L, q^N)$.
They can be represented by points on the plane $q^L_a q^N_1$ as illustrated in fig. 1 for the
cases $N = 1$, $N = 4$, and $N = 16$ respectively. It is not difficult to see (especially
looking at the figure for $N = 16$) that as $N \to \infty$ their set becomes dense in the
convex set $\Gamma_\infty$ defined by

$$\Gamma_\infty := \{(q^L, q^N) \mid \|q^L\|_\infty + \|q^N\|_\infty \le 3/2\}, \tag{25}$$

where $\|q\|_\infty := \max_i \{q_i\}$ is the supremum norm. Thus in the limit $N \to \infty$
we may effectively work with a continuum of plausibility-indexed circumstances in
bijection with the points of this set. Denote this 'limit context' by $I_\infty$.

Figure 1: Possible values of the plausibilities $q^L_a$ and $q^N_1$ for $N$ = 1, 4, and 16.
They are in bijective correspondence with the plausibility-indexed circumstances.
The grey region is the set of all possible pairs of plausibility distributions for two
generic measurements.

^12 (continued) can have been chosen according to a particular scheme; the point is that we do not
know such a scheme.

We observe two interesting facts. The first is that $\Gamma_\infty$ is a proper subset of the
set of all possible pairs of plausibility distributions for two generic measurements
(the grey square region in the figure); the latter is the Cartesian product of two
one-dimensional simplices, $\Delta_1 \times \Delta_1$. We could have let the parameters $(q^L, q^N)$ range
over the latter set; in this case, however, the contexts $I_N$ and $I_\infty$ would have led us
to a vanishing plausibility density for those parameter values not belonging to $\Gamma_\infty$.
The second interesting fact is that neither $\Gamma_\infty$ nor the larger set $\Delta_1 \times \Delta_1$ is a simplex.
This is so because we are considering two measurements (had we considered a single
measurement with four outcomes, we would have dealt with a three-dimensional
simplex instead).^13 See also remark 7.
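
The count of plausibility-indexed circumstances and the shape of their set can be checked numerically. The Python sketch below enumerates the reachable pairs $(q^L_a, q^N_1)$ under the assumption of $N$ balls per mark and $2N$ balls left; it also assumes that the limit set is characterised by a sum of supremum norms not exceeding 3/2, which is what the enumeration itself suggests. The names `parameter_points` and `sup_norm_sum` are illustrative.

```python
from fractions import Fraction
from itertools import product

def parameter_points(N):
    """Distinct pairs (q^L_a, q^N_1) reachable with 2N balls left, at most N per mark."""
    points = set()
    for a1, a2, b1, b2 in product(range(N + 1), repeat=4):
        if a1 + a2 + b1 + b2 == 2 * N:
            points.add((Fraction(a1 + a2, 2 * N), Fraction(a1 + b1, 2 * N)))
    return points

def sup_norm_sum(q_a, q_1):
    """||q^L||_inf + ||q^N||_inf for the two-outcome distributions (q, 1 - q)."""
    return max(q_a, 1 - q_a) + max(q_1, 1 - q_1)

for N in (1, 4, 16):
    pts = parameter_points(N)
    print(N, len(pts), max(sup_norm_sum(*p) for p in pts))
```

The loop reports 5, 41 and 545 points for $N$ = 1, 4, 16, in agreement with $N^2 + (N+1)^2$, and a largest norm sum of 3/2 in each case; the corner (1, 1) of the grey square is never reached.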



3.4 Analysis by marginalisation 

We can write the plausibility distributions for the measurements as marginalisations
over the plausibility-indexed circumstances $\{S_{(q^L,q^N)}\}$, using the latter's plausibilities
$\{P[S_{(q^L,q^N)} | I_N]\}$. Denote for brevity $(q^L, q^N) =: \bar q$, hence $\{S_{(q^L,q^N)}\} \equiv \{S_{\bar q}\}$. Then

$$P(R^k_i | M^k \land I_N) = \sum_{\bar q \in \Gamma_N} P(R^k_i | M^k \land S_{\bar q} \land I_N)\, P(S_{\bar q} | I_N) = \sum_{\bar q \in \Gamma_N} q^k_i\, P(S_{\bar q} | I_N), \qquad k = L, N. \tag{26}$$

Also this formula, like (3) and (18), looks like a weighted sum of plausibilities, and
$P(S_{\bar q} | I_N)$ looks like the plausibility of two plausibility distributions. But this is not
the case, just as it was not in the example of the coin: the propositions $\{S_{\bar q}\}$ speak not
about plausibilities but about possible preparations of the box and its contents; yet
they are suitably indexed according to the plausibilities they lead us to assign to the
measurements' outcomes.
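
As a numerical illustration of the marginalisation, the Python sketch below builds the plausibilities of the plausibility-indexed circumstances for the chest example and forms the weighted sums. It assumes $N$ balls per mark and equally plausible sets of left balls; `indexed_priors` and `marginal` are illustrative names, and each circumstance is keyed by the pair $(q^L_a, q^N_1)$ only, since the remaining components follow by normalisation.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product
from math import comb, prod

def indexed_priors(N):
    """Return {(q^L_a, q^N_1): P(S_q | I_N)} for the chest example."""
    total = comb(4 * N, 2 * N)
    priors = defaultdict(Fraction)
    for counts in product(range(N + 1), repeat=4):
        if sum(counts) == 2 * N:
            a1, a2, b1, b2 = counts
            q = (Fraction(a1 + a2, 2 * N), Fraction(a1 + b1, 2 * N))
            # sum rule: add up the plausibilities of the equivalent circumstances
            priors[q] += Fraction(prod(comb(N, c) for c in counts), total)
    return dict(priors)

def marginal(priors, k):
    """Sum over q of q^k * P(S_q | I_N); k = 0 is 'Letter' ('a'), k = 1 is 'Number' ('1')."""
    return sum(q[k] * p for q, p in priors.items())

priors = indexed_priors(4)
p_a = marginal(priors, 0)   # plausibility of the outcome 'a'
p_1 = marginal(priors, 1)   # plausibility of the outcome '1'
```

Both marginals come out as 1/2, as expected from the symmetry of the context under the exchanges 'a' with 'b' and '1' with '2'.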

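The sums above can also be replaced by integrals over the set $\Gamma_\infty$,

$$P(R^k_i | M^k \land I_\infty) = \int_{\Gamma_\infty} q^k_i\, p_S(\bar q | I_\infty)\, d\bar q, \tag{27}$$

where the density $\bar q \mapsto p_S(\bar q | I_\infty)$ is introduced just like in the example of the coin.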



3.5 'Updating' the plausibilities of the circumstances and the outcomes 

Suppose that the 'Letter' button has been pushed and the outcome 'a' has appeared on
the display. This knowledge places us in a new context, expressed by the proposition
$R^L_a \land M^L \land I_N$. We ask: (1) What plausibilities

$$P(S_{\bar q} | R^L_a \land M^L \land I_N) \tag{28}$$

do we assign to the plausibility-indexed circumstances $\{S_{(q^L,q^N)}\}$ in the new context?
Furthermore, we still have the possibility of pushing the 'Number' button once. So
we also ask: (2) What plausibilities

$$P(R^N_i | M^N \land R^L_a \land M^L \land I_N) \tag{29}$$

do we now assign to the outcomes $\{R^N_1, R^N_2\}$ in case we push the 'Number' button?

^13 You might ask: "Couldn't we consider a single measurement with the four outcomes 'a1', 'a2',
'b1', 'b2' instead?". The answer is: yes, we could have introduced a single fictive measurement with
$M^L$ and $M^N$ arising as marginals. But what for? After all, the rules of this game do not make allowance
for such a measurement.

Let us answer the first question. We use the assumption (valid in the context $I_N$)
that knowledge of the performance of any of the two measurements (but not of their
outcomes!) is irrelevant for assigning plausibilities to the circumstances:

$$P(S_{\bar q} | M^k \land I_N) = P(S_{\bar q} | M^L \land M^N \land I_N) = P(S_{\bar q} | I_N) \quad \text{for all } \bar q, k. \tag{30}$$

With this assumption and eq. (24), Bayes' theorem yields a simplified form for the
sought plausibilities:

$$P(S_{\bar q} | R^L_a \land M^L \land I_N) = \frac{q^L_a\, P(S_{\bar q} | I_N)}{\sum_{\bar q'} q'^L_a\, P(S_{\bar q'} | I_N)}. \tag{31}$$

This is also the answer to an inverse problem. Note, again, that it expresses the
updated plausibility distribution, not 'of the parameter $\bar q$', but of propositions like
'There are $\alpha$ 'a1'-marked balls, . . . , and $\delta$ 'b2'-marked balls left in the chest; or ... ;
or $\alpha'$ 'a1'-marked balls, . . . , and $\delta'$ 'b2'-marked balls left in the chest'.
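
A small worked instance of this update, for $N = 1$, can be sketched in Python as follows. The prior plausibilities are hand-enumerated under the assumption that the six ways of leaving two of the four balls in the chest are equally plausible; the circumstances are keyed by $(q^L_a, q^N_1)$, and the function name `update_on_a` is illustrative.

```python
from fractions import Fraction as F

# P(S_q | I_N) for N = 1, keyed by (q^L_a, q^N_1). Assumption: all 6 ways of
# leaving 2 of the 4 balls ('a1', 'a2', 'b1', 'b2') are equally plausible.
priors = {
    (F(1), F(1, 2)):    F(1, 6),   # 'a1' and 'a2' left
    (F(1, 2), F(1)):    F(1, 6),   # 'a1' and 'b1' left
    (F(1, 2), F(1, 2)): F(1, 3),   # 'a1','b2' left, or 'a2','b1' left
    (F(1, 2), F(0)):    F(1, 6),   # 'a2' and 'b2' left
    (F(0), F(1, 2)):    F(1, 6),   # 'b1' and 'b2' left
}

def update_on_a(priors):
    """Bayes' theorem with likelihood q^L_a for the observed outcome 'a'."""
    evidence = sum(q_a * p for (q_a, _), p in priors.items())
    return {q: q[0] * p / evidence for q, p in priors.items()}

posterior = update_on_a(priors)
```

After the outcome 'a' the circumstance with two 'b'-marked balls left becomes impossible, the one with two 'a'-marked balls left rises from 1/6 to 1/3, and the class with $q^L_a = q^N_1 = 1/2$ keeps its plausibility 1/3.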

To answer the second question we use, beside assumption (30), the following
fact, which holds in our context $I_N$: if we want to determine the plausibility distribution
for one of the measurements, and we know which particular circumstance
holds, then for us it is irrelevant to know whether the other measurement has been
performed, or which outcome it has yielded. For example, if we are interested in
the plausibilities of the outcomes of the 'Number' measurement, and we know that
a particular circumstance $S_{\bar q}$ holds (e.g., that in the chest there are two 'a1'-marked
balls, one 'a2'-marked ball, etc.; or one 'a1'-marked ball, one 'a2'-marked ball, etc.),
then knowledge of the outcome or of the mere performance of the 'Letter' measurement is
irrelevant. In formulae,

$$P[R^k_i | (R^l_j \land M^l) \land M^k \land S_{\bar q} \land I_N] = P(R^k_i | M^l \land M^k \land S_{\bar q} \land I_N) = P(R^k_i | M^k \land S_{\bar q} \land I_N)$$
$$\text{for all } k,\ l \ne k, \text{ and appropriate } i, j. \tag{32}$$

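Analysing eq. (29) in terms of circumstances and using eqs. (31) and (32) we find

$$P(R^N_i | M^N \land R^L_a \land M^L \land I_N) = \frac{\sum_{\bar q} q^N_i\, q^L_a\, P(S_{\bar q} | I_N)}{\sum_{\bar q'} q'^L_a\, P(S_{\bar q'} | I_N)}. \tag{33}$$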



In regard to the assumptions summarised in eqs. (30) and (32), cf. remark 9.
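
The plausibility of a 'Number' outcome after a 'Letter' outcome can be computed along the same lines. The Python sketch below does so for $N = 1$, again with hand-enumerated priors resting on the assumption of equally plausible sets of left balls; `predictive_number_1` is an illustrative name, and the circumstances are keyed by $(q^L_a, q^N_1)$.

```python
from fractions import Fraction as F

# P(S_q | I_N) for N = 1, keyed by (q^L_a, q^N_1); hand-enumerated under the
# assumption that all 6 ways of leaving 2 of the 4 balls are equally plausible.
priors = {
    (F(1), F(1, 2)):    F(1, 6),
    (F(1, 2), F(1)):    F(1, 6),
    (F(1, 2), F(1, 2)): F(1, 3),
    (F(1, 2), F(0)):    F(1, 6),
    (F(0), F(1, 2)):    F(1, 6),
}

def predictive_number_1(priors):
    """Plausibility of '1' given that 'Letter' yielded 'a': a ratio of two weighted sums."""
    numerator = sum(q_1 * q_a * p for (q_a, q_1), p in priors.items())
    denominator = sum(q_a * p for (q_a, _), p in priors.items())
    return numerator / denominator

p_1_given_a = predictive_number_1(priors)
```

With these symmetric priors the result is 1/2: learning the outcome 'a' changes the plausibilities of the circumstances, but not the plausibility assigned to the outcome '1'.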



4 Generalisation and summary of principal formulae 



The two examples should suffice to give an idea of the interpretation of $q$-like parameters
and of their plausibilities, and of the principal consequences of this interpretation.
The reader could try to make similar analyses for the toy models by Kirkpatrick
[105, 106, 107], Spekkens [108], or us [109, 110]. We shall now present the
idea in general and abstract terms. Some additional remarks will also be given.

4.1 Experiments, outcomes, circumstances

In the general case we have a context $I$ and a set of $m$ measurements, represented by
propositions $M^k$, $k = 1, \dots, m$. Each measurement has mutually exclusive and
exhaustive outcomes represented by a set of propositions $\{R^k_i\}$. The number of outcomes
can vary from measurement to measurement, so that $i$ ranges over appropriate
sets for different $k$. The index $k$ is omitted when no confusion arises.

Remark 4. The use of the terms 'measurement' and 'outcome' is only dictated by
concreteness. The formalism and the discussion presented apply in fact to more general
concepts. What we call 'measurement' could be only a casual observation, or
simply a 'state of affairs' which can present itself in mutually exclusive and exhaustive
'forms' (the 'outcomes'). The term 'measurement' shall hence be divested here
of those connotations implying active planning and control, which are not relevant
to our study. Moreover, a 'measurement' need not be associated with a point or
short interval in time or space. It can e.g. be a collection of observations; in this
case its 'outcomes' are all possible combinations of results from these observations.
Finally, note that the $m$ measurements are generally different, i.e., they are not necessarily
'repetitions' of the 'same' measurement, a case that will be discussed in the
second paper instead. □

A set of circumstances $\{C_j\}$ is introduced; these represent a sort of more detailed,
possible descriptions of the context $I$, and are mutually exclusive and exhaustive, i.e.
we know that one and only one of them holds:

$$P(C_{j'} \land C_{j''} | I) = 0 \quad \text{for all } j' \text{ and } j'' \ne j', \tag{34}$$
$$P(\textstyle\bigvee_j C_j | I) = 1. \tag{35}$$

The plausibilities of the measurements' outcomes conditional on the circumstances,

$$P(R^k_i | M^k \land C_j \land I) \quad \text{for all } j, k, \text{ and appropriate } i, \tag{36}$$

are assumed to be given.

Remark 5. The notion of 'circumstance', represented by propositions $C_j$ and later
also $S_{\bar q}$, has been further explained in remark 1. An example of circumstance from
§ 2 is 'Gwendolen tossed the coin'; other examples are 'The temperature during the
experiment was 25 °C' and the more elaborate 'We studied the density of monodisperse
spherical particles in a tall cylindrical tube as a series of external excitations,
consisting of discrete, vertical shakes or 'taps,' were applied to the container' [111].
As in the case of 'measurement', a circumstance need not be related to a single point
or short interval in space or time. For example, in assigning the plausibility that it
will rain or has rained in a given place at a given time, a circumstance might consist
in a specific history of worldwide meteorological conditions over the preceding two
years. For reasons discussed in remark^, we require that a circumstance be described 
or specified in concrete terms, and metastatements like 'The samples are drawn from
a distribution f' or 'The plausibility of head is 1/3' are excluded. Finally, the choice
of an appropriate set of circumstances, i.e., of the appropriate way and depth to analyse
a particular problem (the context), can only be decided on an individual basis, of
course. □



4.2 Plausibility-indexing the circumstances 

The circumstances are then grouped into equivalence classes. Two circumstances are
equivalent if they lead to the same plausibility distributions for each measurement:

$$C_{j'} \sim C_{j''} \iff P(R^k_i | M^k \land C_{j'} \land I) = P(R^k_i | M^k \land C_{j''} \land I) \quad \text{for all } k \text{ and appropriate } i. \tag{37}$$

By construction the equivalence classes are in injective correspondence with
the possible numerical values of the plausibility distributions for the measurements,
$(q^1, q^2, \dots)$. Denote a generic such value by $\bar q := (q^k_i)$, its equivalence class by $\bar q^\sim$,
and membership by $C_j \in \bar q^\sim$ or simply $j \in \bar q^\sim$. We take all disjunctions of equivalent
circumstances

$$S_{\bar q} := \bigvee_{j \in \bar q^\sim} C_j, \tag{38}$$

and call these (in lack of a better name) plausibility-indexed circumstances, shortened
to 'circumstances' whenever no confusion is possible. Conditional on such a
circumstance $S_{\bar q}$, the plausibilities of the outcomes have numerical values identical
to its indices:

$$P(R^k_i | M^k \land S_{\bar q} \land I) = q^k_i, \tag{39}$$



a formula reminiscent of a generalised Bernoulli model (cf. eq. (1)).
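
The construction of the plausibility-indexed circumstances amounts to grouping labels by the distributions attached to them. A minimal Python sketch follows; the circumstance labels `C1`, `C2`, `C3` and their distributions are hypothetical, chosen only for illustration, and `plausibility_index` is our own name.

```python
from collections import defaultdict
from fractions import Fraction as F

def plausibility_index(conditionals):
    """Group circumstances by the distributions they lead to.

    `conditionals` maps a circumstance label to a tuple of distributions,
    one per measurement. The result maps each index q to the list of labels
    whose disjunction forms the plausibility-indexed circumstance S_q.
    """
    classes = defaultdict(list)
    for label, q in conditionals.items():
        classes[q].append(label)
    return dict(classes)

# Hypothetical circumstances for a single two-outcome measurement.
conditionals = {
    "C1": ((F(1, 2), F(1, 2)),),
    "C2": ((F(1, 3), F(2, 3)),),
    "C3": ((F(1, 2), F(1, 2)),),
}
indexed = plausibility_index(conditionals)
```

In this toy case `C1` and `C3` are equivalent and are merged into one plausibility-indexed circumstance, while `C2` forms a class of its own.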

Our main belief, already stated in the coin example, is that plausibility-like parameters
used as arguments of plausibilities can always be interpreted to stand for
some appropriate plausibility-indexed circumstances.^14

The passage to plausibility-grouped circumstances can have two main motiva- 
tions. (1) We can be interested in the plausibilities the circumstances lead to, rather 
than in the latter's intrinsic details. (2) We may want a set of circumstances with 
the property that knowledge of outcomes can increase the plausibility of only one 
circumstance. This is true for the set {Sq}, but not for the set {Cj} in general. In 
fact, knowledge of new outcomes can never lead to an alteration of the ratios of the
plausibilities of two or more equivalent circumstances. Cf. remark ^ and see |^ 



and [46, § 5.3]. 



^14 What constitutes a circumstance is largely a matter of situation, purpose, and personal good taste
as well. The formalism presented cannot think up the circumstances for us. In § 2 we spoke e.g. about
different persons' skills in coin-tossing; but other people could speak about different values of the coin's 
'propensity' to come up heads. Perhaps the reason why 'de Finettians' have always felt uneasy about 
plausibility-like parameters and their priors was that these mathematical objects leave room to ideas and 
concepts that are unnecessary or not in good taste (cf. Jaynes [po|], ch. 3, 'Logic versus propensity'). To 
keep off these ideas they partially denied priors their meaning as plausibilities (this has led, fortunately, 
to some very beautiful ideas and theorems |^^). We hope to have shown here and in the next paper 
that there is no need to adopt such extreme measures.

Remark 6. Suppose that to each outcome $R_i$ of some measurement $M^k$ is associated
a value $r_i$ of some physical quantity, so that it makes sense to speak of the expected
value^15 of this quantity in a generic context $M^k \land J$:

$$\langle r | M^k \land J \rangle := \sum_i r_i\, P(R_i | M^k \land J). \tag{40a}$$

In our case, the formation of equivalence classes of circumstances can then be made
with respect to expected values instead of plausibilities, i.e.,

$$C_{j'} \sim C_{j''} \iff \langle r | M^k \land C_{j'} \land I \rangle = \langle r | M^k \land C_{j''} \land I \rangle \quad \text{for all } k. \tag{40b}$$

In this way we obtain a set of expectation-indexed circumstances $\{S_{\bar r}\}$. Note that
two different circumstances in such a set (leading hence to different expectations)
may lead to the same probability distributions for the outcomes; therefore this set is
not to be confused with, and does not have the same applications as, our $\{S_{\bar q}\}$. □

Particularly interesting is the space $\Gamma$ of the parameters $\bar q$. Since these correspond
to numerical values of plausibility distributions for the measurements, $\Gamma$ is in general
a (possibly proper) subset of a Cartesian product of simplices $\times_k \Delta^{(k)}$, the simplex
$\Delta^{(k)}$ corresponding to the plausibility distribution for the $k$th measurement.

Remark 7. The features of the subset $\Gamma$ will depend on the nature of the circumstances
$\{C_j\}$ (and thus of the $\{S_{\bar q}\}$). In some cases it is simply postulated that some
kinds of circumstances do not present themselves, and this will delimit the subset
accordingly. We saw an instance of this in the box example of § 3, in which the set
$\Gamma$ was, for each $N$, a special proper subset $\Gamma_N$ of the Cartesian product of two
one-dimensional simplices (the grey square region in the figures). There are examples
of physical theories where we postulate (by induction from numerous observations)
that the set of 'circumstances in which a system is prepared' (often called states)
is somehow restricted. This also restricts the space of the mathematical objects
representing these states to particular, non-simplicial (convex) sets. The most notable
example is quantum theory, in which the set of statistical operators (the mathematical
objects representing the states) has very strange shapes [112, 113, 114, 115].^16
The set of Gibbs distributions in classical statistical mechanics provides another example. □

Remark 8. The plausibility-indexed circumstances need not be parametrised by the
values of the plausibility distributions $(q^k_i) \equiv \bar q$. Other parametrisations can be used
as long as they are in bijective correspondence with the $\bar q$ one, and some may be
more useful (cf. [116]). Usually, what is relevant is the convex structure of the set of
parameters $\Gamma$, a point on which we shall return in the third paper. □



^15 Which should not be confused with the average ||92||, defined in terms of observed frequencies.
Cf. footnote ||. 

^16 That is, if we represent this set so as to preserve its convex properties, which are the relevant ones
(see the third note of this series).

4.3 Priors and analysis by marginalisation



If the initial circumstances $\{C_j\}$ have the plausibility distribution $\{P(C_j | I)\}$, by the
sum rule the plausibility-indexed circumstances have distribution

$$P(S_{\bar q} | I) = \sum_{j \in \bar q^\sim} P(C_j | I) \tag{41}$$

(see also remark 11).



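In terms of the plausibility-indexed circumstances, the plausibility distribution
for each measurement outcome can be expressed in marginal form as

$$P(R^k_i | M^k \land I) = \sum_{\bar q \in \Gamma} P(R^k_i | M^k \land S_{\bar q} \land I)\, P(S_{\bar q} | I) = \sum_{\bar q \in \Gamma} q^k_i\, P(S_{\bar q} | I) \equiv \int_{\bar\Gamma} q^k_i\, p_S(\bar q | I)\, d\bar q \tag{42}$$

(cf. eq. (||)), where $\bar q \mapsto p_S(\bar q | I)$ is an appropriate generalised function [117, 118,
119] (see also [120, 121]). The sudden appearance of an integral can be justified (as
customary) as follows: $\bar q$ becomes a continuous parameter whose range is some set
$\bar\Gamma$ such that $\operatorname{conv} \Gamma \subseteq \bar\Gamma \subseteq \times_k \Delta^{(k)}$ (where $\operatorname{conv} \Gamma$ is the convex hull of $\Gamma$), and we
introduce a density function $\bar q \mapsto p_S(\bar q | I)$ such that, for each $\omega \subseteq \bar\Gamma$ (from a suitable
$\sigma$-field of subsets), $\int_\omega p_S(\bar q | I)\, d\bar q = \sum_{\bar q' \in \omega \cap \Gamma} P(S_{\bar q'} | I)$.^17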

Note that to obtain the marginal form above it is assumed that knowledge of
the measurement performed (but not of its outcome!) is irrelevant for assigning the
plausibilities to the circumstances (cf. eq. (30)):

$$P(S_{\bar q} | M^{k_1} \land \dots \land M^{k_n} \land I) = P(S_{\bar q} | I) \quad \text{for all } \bar q,\ n = 1, \dots, m, \text{ and } \{k_t\}. \tag{43}$$
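
The sum rule for the plausibility-indexed circumstances and the discrete marginal form can be illustrated with a short Python sketch. The circumstance labels, their conditional distributions and their prior plausibilities below are hypothetical, chosen only for illustration; `indexed_prior` and `marginal` are our own names.

```python
from collections import defaultdict
from fractions import Fraction as F

def indexed_prior(conditionals, prior):
    """P(S_q | I) as the sum of P(C_j | I) over the equivalent circumstances C_j."""
    p_S = defaultdict(F)
    for label, q in conditionals.items():
        p_S[q] += prior[label]
    return dict(p_S)

def marginal(p_S, k, i):
    """Discrete marginal form: sum over q of q^k_i * P(S_q | I)."""
    return sum(q[k][i] * p for q, p in p_S.items())

# Hypothetical circumstances for a single two-outcome measurement (k = 0).
conditionals = {
    "C1": ((F(1, 2), F(1, 2)),),
    "C2": ((F(1, 3), F(2, 3)),),
    "C3": ((F(1, 2), F(1, 2)),),
}
prior = {"C1": F(1, 4), "C2": F(1, 2), "C3": F(1, 4)}

p_S = indexed_prior(conditionals, prior)
via_indexed = marginal(p_S, k=0, i=0)
via_original = sum(conditionals[c][0][0] * prior[c] for c in prior)
```

Marginalising over the plausibility-indexed circumstances or over the original ones gives the same plausibility, here 5/12, which is the point of the coarse-graining.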



4.4 Updating the plausibilities of circumstances and outcomes 

Upon knowledge of the outcomes $\{R^{k_1}_{i_1}, \dots, R^{k_n}_{i_n}\}$ of any subset $\{M^{k_1}, \dots, M^{k_n}\}$, $n \le m$,
of measurements, the $\{k_t\}$ being all mutually different, the plausibilities of the
circumstances are updated, with the assumption (43), according to

$$P[S_{\bar q} | (R^{k_1}_{i_1} \land M^{k_1}) \land \dots \land (R^{k_n}_{i_n} \land M^{k_n}) \land I] = \frac{q^{k_1}_{i_1} \cdots q^{k_n}_{i_n}\, P(S_{\bar q} | I)}{\sum_{\bar q' \in \Gamma} q'^{k_1}_{i_1} \cdots q'^{k_n}_{i_n}\, P(S_{\bar q'} | I)}, \tag{44}$$

or, in terms of the density $p_S$,

$$p_S[\bar q | (R^{k_1}_{i_1} \land M^{k_1}) \land \dots \land (R^{k_n}_{i_n} \land M^{k_n}) \land I] = \frac{q^{k_1}_{i_1} \cdots q^{k_n}_{i_n}\, p_S(\bar q | I)}{\int_{\bar\Gamma} q'^{k_1}_{i_1} \cdots q'^{k_n}_{i_n}\, p_S(\bar q' | I)\, d\bar q'}. \tag{45}$$

^17 No one forbids us to introduce additional impossible fictitious circumstances $\{C_{j'}\}$ (which may
involve, e.g., 'centaurs, nectar, ambrosia, fairies' [122]) constructed so as to 'complete' the set of
plausibility-indexed circumstances, i.e., in such a way that for each $\bar q = (q^k_i) \in \bar\Gamma$ (note the bar!) there is
always an $S_{\bar q}$, possibly defined in terms of the fictitious $\{C_{j'}\}$, for which $P(R^k_i | M^k \land S_{\bar q} \land I) = q^k_i$.
This operation (which is, mark, not necessary) has no importance nor mathematical consequences
because the fictitious circumstances are impossible, i.e., their plausibilities in the context $I$ are naught,
and thus terms containing them give no contribution in formulae like (p^) or (47).

These formulae are valid for $n \ge 2$ only if we assume that, when a circumstance
is known and we want to assign a plausibility distribution for a measurement, knowledge
of performance of other measurements or of their outcomes is irrelevant (this is
what Caves calls, in a slightly different context (see the second paper in this series),
'learning through the parameter' [78]):

$$P(R^k_i | E \land M^k \land S_{\bar q} \land I) = P(R^k_i | M^k \land S_{\bar q} \land I) \tag{46}$$

for all $\bar q$, where $E$ is any conjunction of any number of mutually
different $\{M^{k_t}\}$ and any number of $\{R^{k_t}_{i_t}\}$ (each $k_t \ne k$).



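Under the assumptions (43) and (46), we also obtain, by marginalisation over
the $\{S_{\bar q}\}$, the plausibility of an outcome given knowledge of outcomes of other
measurements $\{M^{k_t}\}$ different from $M^k$:

$$P[R^k_i | (R^{k_1}_{i_1} \land M^{k_1}) \land \dots \land (R^{k_n}_{i_n} \land M^{k_n}) \land M^k \land I] = \frac{\int_{\bar\Gamma} q^k_i\, q^{k_1}_{i_1} \cdots q^{k_n}_{i_n}\, p_S(\bar q | I)\, d\bar q}{\int_{\bar\Gamma} q'^{k_1}_{i_1} \cdots q'^{k_n}_{i_n}\, p_S(\bar q' | I)\, d\bar q'}. \tag{47}$$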
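
For a finite set of plausibility-indexed circumstances the update and the predictive formula reduce to sums, which can be sketched in Python as follows. The two circumstances `q1`, `q2` and their prior plausibilities are hypothetical, and the sketch presupposes the two irrelevance assumptions stated above; `likelihood`, `update` and `predict` are illustrative names.

```python
from fractions import Fraction as F

def likelihood(q, observed):
    """Product of q^{k}_{i} over the observed (k, i) pairs (all k mutually different)."""
    result = F(1)
    for k, i in observed:
        result *= q[k][i]
    return result

def update(p_S, observed):
    """Posterior plausibilities of the S_q given the observed outcomes (discrete update)."""
    z = sum(likelihood(q, observed) * p for q, p in p_S.items())
    return {q: likelihood(q, observed) * p / z for q, p in p_S.items()}

def predict(p_S, observed, k, i):
    """Plausibility of outcome i of measurement k, which must not appear in `observed`."""
    return sum(q[k][i] * p for q, p in update(p_S, observed).items())

# Two hypothetical circumstances, each indexing distributions for two measurements.
q1 = ((F(9, 10), F(1, 10)), (F(4, 5), F(1, 5)))
q2 = ((F(1, 5), F(4, 5)), (F(3, 10), F(7, 10)))
p_S = {q1: F(1, 2), q2: F(1, 2)}

posterior = update(p_S, [(0, 0)])             # outcome 0 of measurement 0 observed
forecast = predict(p_S, [(0, 0)], k=1, i=0)   # outcome 0 of measurement 1
```

In this toy case the observation raises the plausibility of the first circumstance from 1/2 to 9/11, and the plausibility of outcome 0 of the second measurement from 11/20 to 39/55.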

Remark 9. We should always be careful in assuming and using the conditions summarised
in eqs. (43) and (46), because in many cases they do not hold. An example
would be provided by the example of the coin toss if we considered other tosses
made by the same, unknown, person. In the circumstance in which Jack tosses the
coin, eq. (46) would not hold because from the results of other tosses we would
learn more about Jack's skills in coin-tossing. In fact, even eq. (^ could cease to 
be valid for other tosses, and our set of circumstances would no longer be appropri- 
ate. We discuss similar matters in more detail in the second part of this study. In 
general, also the relations amongst the times or places at which measurements are 
performed can be relevant and thus require a careful analysis. Cf. the examples in 



refs. [105, 106, 107, 108, 109]. □ 



4.5 Further remarks 



Remark 10. A very important point is that the analysis of the context in terms of
circumstances is far from unique (cf. footnote 14). Different sets $\{C'_{j'}\}$, $\{C''_{j''}\}$, $\{C'''_{j'''}\}$,
etc. of circumstances can be introduced to analyse the context, and from them corresponding
sets of plausibility-indexed circumstances $\{S'_{\bar q} \mid \bar q \in \Gamma'\}$, $\{S''_{\bar q} \mid \bar q \in \Gamma''\}$,
$\{S'''_{\bar q} \mid \bar q \in \Gamma'''\}$, etc. can be constructed in the standard way. The circumstances of
each set have to be mutually exclusive and exhaustive for the present formalism to
hold, but they need not be exclusive with those of the other sets. For example, in
the case of the coin toss (§ 2) we could analyse the context of that example into another set of
circumstances, say $\{C'_r \mid r \in\, ]-1, 1[\,\}$ with

$C'_r$ := 'The mass-centre of the coin lies on the coin's (oriented) axis a
fraction $r/2$ of the total width away from the coin centre'. (48)

The analysis and the construction of the plausibility-indexed circumstances would
proceed exactly in the same way, apart from possibly different values of their plausibilities.^18

Different sets $\{C'_{j'}\}$, $\{C''_{j''}\}$, . . . can also be combined into a single set with circumstances
$\{C_{j' j'' \cdots} := C'_{j'} \land C''_{j''} \land \cdots\}$. These will be mutually exclusive and exhaustive
by construction. Again, the corresponding plausibility-indexed set $\{S_{\bar q}\}$ will ensue in
the usual way. □

Remark 11. In view of the preceding remark it is clear that we can find a meaning for
a plausibility-parameter like $\bar q$ in terms of a set of circumstances, but not the meaning,
because that set is not unique. This also implies that different choices of priors for
$\bar q$ need not be contradictory, because they can arise as the plausibilities for different
sets of circumstances. There are, however, some compatibility conditions that the
plausibility distributions for two or more sets of plausibility-indexed circumstances
must satisfy (here stated in terms of densities):

$$\int_{\Gamma'} q^{k_1}_{i_1}\, p_{S'}(\bar q | I)\, d\bar q = \int_{\Gamma''} q^{k_1}_{i_1}\, p_{S''}(\bar q | I)\, d\bar q,$$
$$\int_{\Gamma'} q^{k_1}_{i_1} q^{k_2}_{i_2}\, p_{S'}(\bar q | I)\, d\bar q = \int_{\Gamma''} q^{k_1}_{i_1} q^{k_2}_{i_2}\, p_{S''}(\bar q | I)\, d\bar q,$$
$$\dots$$
$$\int_{\Gamma'} q^{k_1}_{i_1} \cdots q^{k_m}_{i_m}\, p_{S'}(\bar q | I)\, d\bar q = \int_{\Gamma''} q^{k_1}_{i_1} \cdots q^{k_m}_{i_m}\, p_{S''}(\bar q | I)\, d\bar q,$$
$$\text{for all mutually different } k_t \text{ and appropriate } i_t \tag{49}$$

(i.e., some of their moments must be equal), where $m$ is the number of measurements.
These conditions arise simply by analysing the plausibilities $P(R^k_i | M^k \land I)$ and
(47) first by means of one set of circumstances, then by means of the other, equating
the expressions thus obtained, and applying property (^) (under the assumptions
(43) and (46)). □

Remark 12. The formalism also lends itself naturally to iteration. One can introduce
'circumstances of circumstances', etc., i.e. deeper and deeper levels of analysis for 
the context /. What mathematically comes about looks like a hierarchy of 'plausi- 
bilities of plausibilities', 'plausibilities of plausibilities of plausibilities', etc., which 



^18 Note that the position of the mass-centre of a coin is not a very important factor in the assignment
of a plausibility to heads or tails. See Jaynes' discussion [ po[ ch. 10].

Good calls 'probabilities of Type I, II, III', etc. [51]. Of course, such a cornucopia of
recursive analyses may be appropriate and useful in some cases, while in others it may
just lead to constipation. □

Remark 13. The interpretation here presented may also provide another point of view 
on theories of interval-valued probabilities (see e.g. [123, 124][75, esp. § 3.1] and 
cf. [|6T|, § 2.2] [125]), and in this sense completes or re-interprets studies by e.g. Jamison
n6|| , Levi Ml |I2|] , Fishburn [|l27l] , Nau [|l2l . □



5 Conclusions 

We often have the need to use statistical models with plausibility-like parameters,
especially in classical and quantum mechanics, and must face the problems of choosing
a suitable parameter space and a plausibility distribution on this space. These
problems would sometimes be less difficult if the parameters could be given some
interpretation.

Some interpret the parameters as 'propensities' or 'physical probabilities'. But 
these concepts do not make sense to us. 

De Finettians say that we should not interpret the parameters, but think in terms 
of infinitely exchangeable sequences instead; the parameters and their priors then 
arise as mathematical devices. But we do not like being forced to think in terms 
of infinite sequences, whose vast majority (∞) of elements must then necessarily be
fictitious. And there are situations that can be repeated a finite number of times only. 

In addition to this, looking at concrete applications of statistical models it seems 
that behind the parameters we often have 'at the back of our minds' an idea of some 
possible hypotheses — 'circumstances' — that could hold in the context under study, 
e.g. a physical measurement. These circumstances could help us in the assignment 
of plausibilities. And they need not concern 'causes' or 'propensities'; see remarks 1
and 5. At the same time, we are sometimes not interested in the intrinsic details of
such circumstances, but only in the plausibilities that we eventually assign on their 
grounds. 

We have seen in this study that plausibility theory allows us, starting from any set
$\{C_j\}$ of circumstances, to form another, 'coarse-grained' set $\{S_{\bar q}\}$ with the property
that its circumstances lead each one to a different plausibility distribution. The circumstances
of this set can then be uniquely indexed by the plausibility distributions
they lead us to assign. This set, moreover, is invariant with respect to changes in
the plausibilities of the initial and the coarse-grained sets of circumstances, $\{P(C_j | I)\}$
and $\{P(S_{\bar q} | I)\}$.

This suggests that plausibility-like parameters like $\bar q$, when used as arguments of
plausibility formulae, can always be interpreted to stand for some appropriately indexed
circumstances like $S_{\bar q}$. With mathematical care, this may even hold for parameters
of continuous statistical models. Parameter priors like $f(\bar q | I)$ can consequently
be interpreted as plausibilities of circumstances $P(S_{\bar q} | I)$.

The study of how these priors are updated when repetitions of 'similar' measurements
occur, and of particular applications to classical and quantum mechanics, are
developed in the next two papers.